
Atty. Docket No. YOR9-2000-0168US1

(590.014) 

REMARKS 

In the Office Action dated September 8, 2004, pending Claims 1-27 were rejected 
and the rejection made final. In response Applicants have filed herewith a Request for 
Continued Examination and have amended independent Claims 1, 14, and 27. These 
amendments are not in acquiescence to the Office's position on allowability of the claims,
but are made merely to expedite prosecution; no change in the scope of the claims is 
intended by Applicants. 

Applicants and the undersigned are most grateful for the time and effort accorded 
the instant application by the Examiner. On November 18, 2004, Applicants' counsel and 
one of the inventors, Jiri Navratil, conducted a telephone interview with the Examiner in 
which the present application, the Goldenthal et al. reference, and the Newman et al.
reference were discussed. No agreement, however, was reached with respect to the 
claims of the present application. 

The Office is respectfully requested to reconsider the rejections presented in the 
outstanding Office Action in light of the following remarks. 

The disclosure continues to be objected to because of a number of asserted 
informalities. Equation 1 on Page 8 has been amended to correct a minor typographical 
error. Thus, it is submitted this objection should be withdrawn. 

The specification also continues to be objected to as failing to provide proper 
antecedent basis for the claimed subject matter, specifically the term of "non-interpolated 
likelihood value" appearing in Claims 1, 14 and 27. Applicants respectfully submit this
objection is improper and should be withdrawn as - contrary to the assertion by the
Office - this term is in fact understood by one of ordinary skill in the art. A Non-Interpolated
Likelihood Value is defined as the complement/opposite to an Interpolated Likelihood 
Value, which in turn is obtained in the present application from the likelihood function 
(of the hierarchical speaker model) as a weighted sum of individual values calculated on 
the various levels and in various units of the hierarchical model. An example of a
non-interpolated likelihood value is the (single) maximum likelihood value which can be
determined by calculating likelihoods on all levels and units of the hierarchical model and 
taking the maximum value. 
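
By way of illustration only, and without limiting the claims, the distinction
drawn above can be sketched as follows. The sketch is not taken from the
specification; the function names, the weights, and the numerical values are
hypothetical and are chosen solely to make the contrast concrete.

```python
def interpolated_likelihood(likelihoods, weights):
    """Weighted sum of the likelihoods computed on every level and unit."""
    if len(likelihoods) != len(weights):
        raise ValueError("one weight is needed per likelihood")
    return sum(w * p for w, p in zip(weights, likelihoods))


def max_likelihood(likelihoods):
    """One non-interpolated value: the single maximum likelihood."""
    return max(likelihoods)


# Hypothetical likelihoods of one feature vector, one per (level, unit) pair.
frame_likelihoods = [0.25, 0.5, 0.125, 0.125]
# Hypothetical interpolation weights, assumed here to sum to one.
weights = [0.25, 0.25, 0.25, 0.25]

print(interpolated_likelihood(frame_likelihoods, weights))  # 0.25
print(max_likelihood(frame_likelihoods))                    # 0.5
```

The interpolated value blends every level and unit, whereas the maximum
likelihood value is taken from a single level and unit.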

In order to expedite prosecution, however, the claims have been amended to recite
"likelihood value" rather than "non-interpolated likelihood value". Support for the term
"likelihood value" may be found throughout the specification. (See, e.g., Page 12, lines
12-15: "Using the complete unit ensemble provided by the model, a scoring method then
assigns the best matching likelihood to each feature vector frame and thus maximizes the 
resulting model score." and Page 9, lines 10-14: "In a "pickmax" technique in accordance 
with an embodiment of the present invention (step 209), the likelihood score S for each of 
the structured models mentioned above is calculated as the average of the likelihoods of 
the T feature vectors which, in turn, are obtained as the maximum likelihoods computed 
over all units and all levels of the given speaker's structured model ("pickmax").") 
Accordingly, Applicants respectfully submit this objection has been obviated. 

Claims 1-3, 6-12, 14-16, 19-25 and 27 stand rejected under 35 U.S.C. 103(a) over
Goldenthal et al. in view of Newman et al. Reconsideration and withdrawal of the 
present rejection is hereby respectfully requested. 

The present invention broadly contemplates, in accordance with at least one 
presently preferred embodiment, the calculation of scores in such a way that the total 
likelihood is a weighted sum of the likelihood of all phonetic units at all levels of 
phonetic granularity (model grains), and that the weights are derived in such a way that 
the determination of the robustness and significance of the individual model grains is 
approached with emphasis. (Page 2, line 16 - Page 3, line 4) Given a structured model 
M(i,j) for a speaker with 1 ≤ i ≤ L levels of detail and with 1 ≤ j ≤ K(i) units on the i-th
level, the score (as log-probability) for the utterance is calculated in each level separately,
whereby explicit labeling information is used to identify the corresponding phonetic unit 
that is to be used on each level. (Page 8, lines 5-8) As discussed in the specification, the 
number of units on each level and the number of levels may vary across speakers, since 
there might be less data available from certain speakers, entailing the necessity of 
omitting certain units altogether. (Page 10, lines 7-9) 
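
Purely as an illustrative sketch of the level-by-level scoring summarized
above, and not as a statement of the implementation disclosed in the
specification, the calculation might be arranged as shown below. The
one-dimensional Gaussian unit models, the data layout, the equal weights, and
all names are assumptions introduced for this example; the manner in which
the weights are derived is not shown.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log-density of a 1-D Gaussian (a stand-in for a real unit model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def level_scores(model, frames, labels):
    """Average log-likelihood on each level, using labels to pick the unit.

    model:  list of levels; each level maps a unit label to (mean, var).
    frames: list of scalar feature values (real systems use vectors).
    labels: labels[t][i] is the unit label of frame t on level i.
    """
    scores = []
    for i, units in enumerate(model):
        total, count = 0.0, 0
        for x, frame_labels in zip(frames, labels):
            unit = units.get(frame_labels[i])
            if unit is None:  # unit omitted for this speaker (too little data)
                continue
            total += gaussian_log_likelihood(x, *unit)
            count += 1
        scores.append(total / count if count else None)
    return scores


def weighted_total(scores, weights):
    """Weighted sum over the levels that produced a score."""
    return sum(w * s for w, s in zip(weights, scores) if s is not None)


# Hypothetical two-level model; the second level lacks a "consonant" unit.
model = [{"speech": (0.0, 1.0)}, {"vowel": (0.0, 1.0)}]
frames = [0.0, 0.0]
labels = [["speech", "vowel"], ["speech", "consonant"]]

scores = level_scores(model, frames, labels)
print(scores, weighted_total(scores, [0.5, 0.5]))
```

In this sketch a frame whose labeled unit is absent on a given level is simply
skipped on that level, which is one possible way of accommodating speakers for
whom certain units have been omitted.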

As presently best understood, Goldenthal appears to be directed to a two-stage 
cohort selection technique used to reduce the equal error rate of a speaker verification 
process which validates the claimed identity of an unknown speaker. (Col. 2, lines 61-65) 
First, the digitized signals of an unknown speaker seeking verification are compared
with acoustic models corresponding to the claimed identification to determine "claimed"
log likelihood scores. (Col. 5, lines 18-22) Then the same testing signals are compared
with all of the cohort models to determine cohort log likelihood scores and then a smaller
subset of cohort scores is dynamically selected. (Col. 5, lines 23-30) The claimed scores 
and the dynamically selected scores are then presented to a validator which determines 
whether or not a threshold difference between the two scores is present. (Col. 5, lines
31-46) There is, however, no teaching or suggestion that the models used in calculating the
various log likelihood scores have multiple levels of detail or that scores are calculated 
for each level of detail. 
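
For illustration only, the two-stage flow described above may be sketched as
follows. The sketch is not drawn from Goldenthal itself: the selection rule
(taking the highest cohort scores), the use of their mean, and every name and
value below are assumptions made for this example, and the reference may
differ in each respect.

```python
def verify(claimed_score, cohort_scores, subset_size, threshold):
    """Accept if the claimed score exceeds the selected cohort scores by a margin.

    Assumption: the dynamically selected subset is the `subset_size` highest
    cohort log likelihood scores, summarized by their mean.
    """
    if not 0 < subset_size <= len(cohort_scores):
        raise ValueError("subset_size must be between 1 and the cohort size")
    selected = sorted(cohort_scores, reverse=True)[:subset_size]
    cohort_mean = sum(selected) / len(selected)
    return (claimed_score - cohort_mean) >= threshold


# Hypothetical log likelihood scores for one test utterance.
claimed = -10.0
cohort = [-14.0, -12.0, -20.0, -16.0]

print(verify(claimed, cohort, subset_size=2, threshold=2.0))  # True
print(verify(claimed, cohort, subset_size=2, threshold=4.0))  # False
```

Each model in this sketch yields a single score for the utterance; nothing in
it computes a value for each of several levels of detail.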

As presently best understood, Newman appears to be directed to producing a speech
model for use in determining whether a speaker associated with the speech model 
produced an unidentified speech sample. The speech model is produced without using an 
external mechanism to monitor the accuracy with which the contents were identified. 
(Col. 1, line 65 - Col. 2, line 11) As noted in the Office Action, each model of a word
may be represented by a set of phonemes that represent the phonetic spelling of a word. 
Furthermore, each phoneme may be represented by three sets of model parameters that 
correspond to the three nodes of the phoneme. (Col. 6, lines 26-29) This is not, however, 
having multiple levels of phonetic detail in accordance with the present invention. 

A 35 U.S.C. 103(a) rejection requires that the combined cited references provide 
both the motivation to combine the references and an expectation of success. There is, 
however, absolutely no teaching or suggestion in Newman that would lead one of 
ordinary skill in the art to modify Goldenthal to arrive at the present invention. 
Moreover, actually combining the teachings of Goldenthal and Newman would not result 
in the present invention, which requires specifically "providing a model
corresponding to a target speaker, the model being resolved into at least one frame and
capable of having a plurality of levels of phonetic detail of varying resolution for each 
frame" and "determining, for each frame and each level of phonetic detail of the 
target speaker model, a likelihood value; and resolving the at least one likelihood value
to obtain a likelihood score." (Claim 1; emphasis added) Similar language appears in the 
other independent claims. This hierarchical approach in which there are a plurality of 
levels of phonetic detail with varying resolution, determining a likelihood value for each 
level, and then using the likelihood values to determine a likelihood score is simply not 
taught or suggested by either Goldenthal or Newman. 

Applicants acknowledge that Claims 4-5, 13, 17-18 and 26 were indicated by the 
Examiner as being allowable if rewritten in independent form. Applicants reserve the 
right to file new claims of such scope at a later date that would still, at that point, 
presumably be allowable. 

In view of the foregoing, it is respectfully submitted that Claims 1, 14 and 27 fully
distinguish over the applied art and thus are in condition for allowance. By virtue of
dependence from what are believed to be allowable independent Claims 1 and 14, it is 
respectfully submitted that Claims 2-13 and 15-26 are also presently allowable. 

In summary, it is respectfully submitted that the instant application, including 
Claims 1-27, is presently in condition for allowance. Notice to that effect is hereby
earnestly solicited. In the unlikely event, however, it appears the claims will not be 
allowed, the Office is invited to call the undersigned to discuss the claims prior to the
issuance of a further Office Action. 



Respectfully submitted, 
(412) 741-8400

(412) 741-9292 - Facsimile 

Attorneys for Applicants 